Purification of recombinant proteins fused to multiple epitopes

ABSTRACT

The present invention provides novel identification polypeptides containing multiple copies of an antigenic domain joined in tandem to provide increased sensitivity for the detection and purification of target peptides, a cleavable linking sequence and optionally a spacer domain. Further provided are hybrid polypeptide molecules composed of an identification polypeptide and a target peptide which are produced by recombinant DNA technology and purified using affinity chromatography using one or more ligands. Accordingly, also provided are DNA expression vectors containing DNA encoding for identification polypeptides and methods for using such identification polypeptides for the purification of target peptides. Also provided are methods of constructing DNA vectors encoding the novel identification polypeptides and DNA expression vectors encoding the identification polypeptides linked to a target peptide.

REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 09/970,308, filed Oct. 3, 2001, which is a division of U.S. patent application Ser. No. 09/415,000, filed Oct. 8, 1999, the entire contents of which are incorporated herein by reference.

FIELD OF INVENTION

The present invention relates to protein tags and methods of protein purification using various recombinant DNA techniques. More particularly, the invention is directed to novel identification polypeptides and DNA vectors encoding novel identification polypeptides containing multiple antigenic domains joined in tandem. Also provided are methods for using such identification polypeptides for the purification of target peptides and methods of constructing DNA vectors encoding the novel identification polypeptides and DNA expression vectors encoding the identification polypeptides linked to a target peptide.

BACKGROUND OF THE INVENTION

Proteinaceous molecules such as enzymes, hormones, storage proteins, binding proteins, transport proteins and signal transduction proteins may be produced and purified using various recombinant DNA techniques. For instance, DNA fragments coding for a selected protein, together with appropriate DNA sequences for a promoter and ribosome binding site, are ligated to a plasmid vector. The plasmid is inserted within a host prokaryotic or eukaryotic cell. Transformed host cells are identified, isolated and then cultivated to cause expression of the proteinaceous molecules. One method used to purify hybrid polypeptides is the poly-arginine system in which a hybrid polypeptide is selectively purified on a cation exchange resin. See Sassenfeld, H. M. and Brewer, S. J. BioTechnology, 2:76 (1984); U.S. Pat. No. 4,532,207. Sassenfeld and Brewer reported a carboxy-terminal extension of five arginine residues fused to a target protein. This basic polyarginine extension allowed the purification of the hybrid polypeptide on an SP-Sephadex resin. An analogous protein expression and purification system employs a polyhistidine tract or tag at either the amino- or carboxy-terminus of the hybrid polypeptide. The fusion protein is purified by chromatography on a Ni²⁺ metal affinity resin. See Porath, J., Protein Expression and Purification, 3:7995 (1992).

Additionally, various affinity purification protocols are currently employed to facilitate the isolation of fusion proteins. Affinity chromatography is based on the capacity of proteins to bind specifically and noncovalently with a ligand. Used alone, it can isolate proteins from very complex mixtures with not only a greater degree of purification than possible by sequential ion-exchange and gel column chromatography, but also without significant loss of activity. Typically, a ligand capable of binding with high specificity to an affinity matrix is chosen as the fusion partner. For example, p-aminophenyl-β-D-thiogalactosidyl-succinyldiaminohexyl-Sepharose selectively binds to β-galactosidase, allowing the purification of β-gal fusion proteins. See Germino et al., Proc. Natl. Acad. Sci. USA 80:6848 (1983). Other expression systems which permit the affinity purification of fusion proteins include fusion proteins made with glutathione-S-transferase, which are selectively recovered on glutathione-agarose. See Smith, D. B. and Johnson, K. S. Gene 67:31 (1988). IgG-Sepharose can be used to affinity purify fusion proteins containing staphylococcal protein A. See Uhlen, M. et al. Gene 23:369 (1983). The maltose-binding protein domain from the malE gene of E. coli has been used as a fusion partner and allows the affinity purification of the fusion protein on amylose resins.

Another method used to detect and isolate proteins is by use of an epitope tag. Epitope tagging utilizes antibodies against guest peptides to study protein localization at the cellular and subcellular levels. See Kolodziej, P. A. and Young, R. A., Methods Enzymol., 194:508-519 (1991). Using recombinant DNA technology, a sequence of nucleotides encoding the epitope is inserted into the coding region of the cloned gene, and the hybrid gene is introduced into a cell by a method such as transformation. When the hybrid gene is expressed, the result is a chimeric protein containing the epitope as a guest peptide. If the epitope is exposed on the surface of the protein, it is available for recognition by the epitope-specific antibody, allowing the investigator to observe the protein within the cell using immunofluorescence or other immunolocalization techniques. Further, fusion proteins labeled with such epitope tags are frequently used for purifying proteins utilizing affinity purification techniques.

Thus, epitope tagging has become a powerful tool for the detection and purification of expressed proteins. See Kolodziej, P. A. and Young, R. A., Methods Enzymol., 194:508-519 (1991). Many types of tags have been used, with c-myc and FLAG® tags being two of the most popular epitopes. See Evan et al., Mol Cell Biol. 5:3610-3616 (1985). Generally, these epitopes are fused to the amino or carboxy-terminus of the expressed protein, making them more accessible to the antibody for detection and less likely to cause severe structural or functional perturbations.

Fusion proteins having the FLAG® octapeptide Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1) at the amino-terminus can be affinity purified on an immuno-affinity resin containing an antibody specific for the octapeptide. See Hopp, T. P., et al. Biotechnology, 6:1204 (1988); Prickett, K. S., et al., BioTechniques, 7:580 (1989); and U.S. Pat. No. 4,851,341. The FLAG® epitope tag has been effectively used to detect and purify protein in mammalian and bacterial systems. The original FLAG® sequence is recognized by two antibodies, M1 and M2, and a FLAG® sequence with an initiator methionine attached is recognized by a third antibody, M5. The last five amino acids of the FLAG® sequence form a recognition site for the protease enterokinase, thus allowing for removal of the FLAG® epitope. The FLAG® epitope has been used in various expression systems for detection and purification of heterologous proteins, e.g., in E. coli (Brizzard et al., BioTechniques, 16:730-735 (1994)), Saccharomyces cerevisiae (Lee et al., Nature, 372:739-746 (1994); Prickett et al., BioTechniques, 7:580-589 (1989)), Drosophila (Xu et al., Development, 117:1223-1237 (1993)), Baculovirus (Dent et al., Mol. Cell Biol, 15:4125-4135 (1995); Ritchie et al., Biochem Journal, 338:305-10 (1999)), and mammalian systems (Overholt et al., Clin. Cancer Res., 3:185-191 (1997); Schulte am Esch et al., Biochemistry, 38:2248-2258 (1999)). However, in many mammalian expression systems, protein expression levels are low and effective detection of expressed foreign proteins using established methods can be difficult.

There is therefore a need for an epitope tag and expression system employing such epitope tags which would allow for increased sensitivity and detection of recombinant proteins.

SUMMARY OF THE INVENTION

The present invention addresses one or more of the foregoing problems by providing methods and vehicles which can be used to produce high yields of recombinant proteins. Accordingly, among the several objects of the present invention may be noted the provision of a novel identification polypeptide, a hybrid molecule composed of a target peptide fused to the novel identification polypeptide and recombinant DNA vectors encoding the same. Also provided are methods for the purification of the target peptide wherein a single ligand or multiple ligands, preferably antibodies, may be employed to isolate and purify substantially all protein molecules expressed by transformed host cells, whether antigenic or not. A further object of the present invention is to provide processes which can be used to highly purify any protein molecule produced by recombinant DNA methods, including those that are not susceptible to affinity chromatography procedures.

Briefly, therefore, the present invention is directed to an identification polypeptide comprising multiple copies of an antigenic domain joined together in tandem. The identification polypeptide may contain a linking sequence containing a cleavable site located adjacent to the target peptide wherein the cleavable site is not located in or interposed between the individual antigenic domains. Each antigenic domain is capable of eliciting an antigenic response and can be bound by a ligand, preferably an antibody. Further, each antigenic domain is comprised of a combination of at least two, preferably three or more different amino acids.

Also provided are fusion proteins of the present invention comprising the novel identification polypeptide fused to a target peptide. The identification polypeptide contains a linking sequence which is characterized by being cleavable at a specific amino acid residue adjacent to the target peptide by use of a sequence specific proteolytic agent. Such cleavable site is located adjacent to either the carboxy-terminus or amino-terminus of the target peptide, preferably located immediately adjacent to the amino-terminus of the target peptide. Ideally, the amino acid sequence of the cleavable site is unique, thus minimizing the possibility that the proteolytic agent will cleave the target peptide. In a preferred embodiment, the cleavable site comprises amino acids specific for enterokinase, thrombin or Factor Xa.

In accordance with this particular construct of the fusion protein, the target peptide may be isolated by affinity chromatography techniques. Thus, it is an object of the invention to provide methods for the purification of the target peptide. This is accomplished by constructing an affinity column with immobilized ligands specific for the antigenic domains of the identification polypeptide, thereby binding the fusion protein. It will be appreciated that by virtue of the present invention, a single antibody or multiple antibodies may be used to bind to the individual antigenic domains comprising the multiple antigenic domains of the identification polypeptide. Then the bound fusion protein can be liberated from the column and the identification polypeptide cleaved with an appropriate proteolytic agent, thus releasing a purified target peptide. In a preferred embodiment, the proteolytic agent used to cleave the target peptide from the identification polypeptide is selected from the group consisting of enterokinase, thrombin and Factor Xa.

A further object of the present invention is to provide a recombinant cloning vector containing DNA encoding for the identification polypeptide. The vector encoding for the identification polypeptide also includes DNA sequences coding for a multiple cloning site comprised of multiple restriction enzyme sites which may be located between the antigenic domains or on either side of the antigenic domains, which will enable one skilled in the art to insert any number of DNA sequences encoding for any desired protein. This DNA sequence may be inserted within a cloning vector such as a plasmid, by use of appropriate restriction endonucleases and ligases. The recombinant plasmid is employed to transform compatible prokaryotic or eukaryotic host cells for replication of the plasmid and expression of the hybrid affinity domain/protein molecule. Ideally, the plasmid has a phenotypic marker gene for identification and isolation of transformed host cells. In a preferred embodiment, DNA sequences encoding for a secreted signal peptide will be joined either to the DNA vector or to the plasmid, thus enabling the transformed host cells to be readily identified and separated from cells which do not undergo transformation.

Other objects and features will be in part apparent and in part pointed out hereinafter.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a depiction of a DNA (SEQ ID NOS: 15 and 16) and protein (SEQ ID NO: 17) sequence of 3XFLAG-CMV-7 multiple cloning site and FLAG® sequences.

FIG. 1B is a plasmid map of the p3XFLAG-CMV-7 showing the CMV promoter, human growth hormone transcription termination and polyadenylation site, SV40 origin of replication, Col E1 origin of replication, and β-lactamase gene.

FIG. 2A is a vector map of p3XFLAG-CMV-7-BAP showing insertion of the phoA coding region into p3XFLAG-CMV-7.

FIG. 2B is a vector map of p3XFLAG-ATS-BAP showing insertion of the phoA coding region into pFLAG-ATS-BAP.

FIG. 3 is a Western Blot of purified 3XFLAG-BAP (A) and N-FLAG-BAP (B) using anti-FLAG M2 antibody. Lane(s) (1) 0.5 ng; (2) 1.0 ng; (3) 2.0 ng; (4) 5.0 ng; and (5) 10 ng. Amounts shown are the amounts loaded onto the gel before transfer.

ABBREVIATIONS AND DEFINITIONS

To facilitate understanding of the invention, a number of terms are defined below:

The nucleotide bases are abbreviated herein as follows:

-   A represents adenine;
-   C represents cytosine;
-   G represents guanine;
-   T represents thymine; and
-   U represents uracil.

The amino acid residues are abbreviated herein according to their single letters:

-   A represents alanine;
-   R represents arginine;
-   N represents asparagine;
-   D represents aspartic acid;
-   C represents cysteine;
-   Q represents glutamine;
-   E represents glutamic acid;
-   G represents glycine;
-   H represents histidine;
-   I represents isoleucine;
-   L represents leucine;
-   K represents lysine;
-   M represents methionine;
-   F represents phenylalanine;
-   P represents proline;
-   S represents serine;
-   T represents threonine;
-   W represents tryptophan;
-   Y represents tyrosine; and
-   V represents valine.

The term “recombinant DNA molecule” as used herein refers to a DNA molecule which is comprised of segments of DNA joined together by means of recombinant DNA technology.

The term “expression vector” as used herein refers to nucleic acid sequences containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes include a promoter, a ribosome binding site, an initiation codon, a stop codon, optionally an operator sequence and possibly other sequences. Eukaryotic cells utilize promoters, a Kozak sequence and often enhancers and polyadenylation signals. Prokaryotic cells also utilize a Shine-Dalgarno ribosome binding site. The present invention includes vectors or plasmids which can be used as vehicles to transform any viable host cell with the recombinant DNA expression vector.

The term “FLAG” as used herein is the registered trademark that refers to the widely used FLAG® epitope tag consisting of the synthetic peptide sequence of DYKDDDDK (SEQ ID NO: 1) as described in U.S. Pat. Nos. 4,703,004, 4,782,137, 4,851,341 and 5,011,912 incorporated herein by reference.

The term “hydrophilic” when used in reference to amino acids refers to those amino acids which have polar and/or charged side chains. Hydrophilic amino acids include lysine, arginine, histidine, aspartate (i.e., aspartic acid), glutamate (i.e., glutamic acid), serine, threonine, cysteine, tyrosine, asparagine and glutamine.

The term “hydrophobic” when used in reference to amino acids refers to those amino acids which have nonpolar side chains. Hydrophobic amino acids include valine, leucine, isoleucine, cysteine and methionine. Three hydrophobic amino acids have aromatic side chains. Accordingly, the term “aromatic” when used in reference to amino acids refers to the three aromatic hydrophobic amino acids phenylalanine, tyrosine and tryptophan.

The term “cleavable site” refers to a defined amino acid sequence that allows cleavage of a protein or peptide containing this sequence by a selective proteolytic agent.

The term “fusion protein” as used herein refers to a hybrid polypeptide which comprises protein domains from at least two different proteins. The target peptide may be located at the amino-terminal portion of the fusion protein or at the carboxy-terminal portion, thus forming an “amino-terminal fusion protein” or a “carboxy-terminal fusion protein”, respectively.

The term “target peptide” as used herein refers to the peptide whose expression is desired within the hybrid polypeptide. In the hybrid polypeptides of the invention, the target peptide may comprise either the amino- or carboxy-terminal portion of the hybrid polypeptide.

The term “endoprotease” or “endopeptidase” as used herein refers to a protease capable of hydrolyzing interior peptide bonds of a polypeptide, at points other than the terminal bonds (i.e., the peptide bonds of the terminal amino acid).

The terms “nucleic acid molecule encoding,” “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.

For all the nucleotide and amino acid sequences disclosed herein, it is understood that equivalent nucleotides and amino acids can be substituted into the sequences without affecting the function of the sequences. Such substitution is within the ability of a person of ordinary skill in the art.

The procedures disclosed herein which involve the molecular manipulation of nucleic acids are known to those skilled in the art. See generally Fredrick M. Ausubel et al. (1995), “Short Protocols in Molecular Biology,” John Wiley and Sons, and Joseph Sambrook et al. (1989), “Molecular Cloning, A Laboratory Manual,” 2d ed., Cold Spring Harbor Laboratory Press, as incorporated herein by reference.

DETAILED DESCRIPTION

In accordance with the present invention, provided are identification polypeptides with increased sensitivity for the detection and purification of target peptides which are produced using recombinant DNA technology. Further provided are hybrid polypeptide molecules composed of an identification polypeptide and a target peptide which are produced by recombinant DNA technology and purified using affinity chromatography using one or more ligands. Accordingly, also provided are DNA expression vectors which include segments of DNA encoding for the identification polypeptide and the desired target peptide.

In accordance with the present invention, a target peptide may be composed of any proteinaceous substance that can be expressed in transformed host cells. The increased antigenicity of the identification polypeptide results from the presence of multiple copies of an antigenic domain in tandem. The identification polypeptide also contains a cleavable linking sequence which joins the identification polypeptide to the target peptide, thus producing a hybrid polypeptide. The DNA cloning vectors may be replicated and the hybrid polypeptide composed of the identification polypeptide and a target peptide is expressed in prokaryotic or eukaryotic cells transformed with the vector. The transformed cells are isolated and then expanded in culture or by other means known in the art.

Hybrid polypeptide molecules of the present invention may be purified by affinity chromatography. The hybrid polypeptide molecule comprising the identification polypeptide and target peptide may be purified using an affinity resin which binds to the antigenic domains of the identification polypeptide. Generally, ligands specific to the antigenic domains of the identification polypeptide tag are immobilized on a bead column or other type of matrix. An extract of the host cells made from the culture is applied to the column and then the polypeptides that bind to the column are eluted. Thereafter, the identification polypeptide is cleaved from the target peptide molecule with an appropriate proteolytic agent, thereby releasing the target peptide in a highly purified state.

Identification Polypeptide

The identification polypeptide of the present invention is a sequence of amino acid residues flanking the amino or carboxy terminus of a target peptide. In general, the identification polypeptide includes multiple copies of an antigenic domain, where each antigenic domain is capable of binding an antibody, a cleavable linking sequence to join the antigenic domains to the target peptide, and optionally one or more spacers.

To increase the detection sensitivity, the identification polypeptide preferably contains multiple copies of an antigenic domain, i.e., at least two copies of an antigenic domain, preferably at least three copies of an antigenic domain and, in some embodiments, four or more copies of an antigenic domain. The ability of the sequence of multiple antigenic domains to bind to an antibody immobilized, for example, on a column or other matrix enables the isolation and purification of target peptides.

Each antigenic domain of the sequence of multiple antigenic domains preferably comprises no more than about twenty amino acid residues, more preferably no more than about fifteen amino acid residues, even more preferably no more than about ten amino acid residues, and still more preferably no more than about six amino acid residues in total. In addition, each antigenic domain preferably comprises at least two, more preferably at least three different amino acid residues, preferably selected from among hydrophilic and aromatic amino acids. While nonaromatic, hydrophobic amino acid residues need not be excluded from the antigenic domains, it is generally preferred that at least one-half of the amino acid residues constituting the antigenic domains be selected from among hydrophilic and aromatic amino acids, still more preferred that at least one-half of the amino acid residues constituting the antigenic domains be hydrophilic amino acids, and still further preferred that at least three-fourths of the amino acid residues constituting the antigenic domains be hydrophilic amino acid residues. In one preferred embodiment, the amino acid residues constituting the antigenic domains are one half hydrophilic amino acids and one half aromatic amino acids. In another preferred embodiment, the amino acid residues constituting the antigenic domains are selected from hydrophilic amino acids.

In one preferred embodiment of the present invention, each antigenic domain is defined by a series of about six to about ten amino acid residues comprising residues of at least three different amino acids with at least one being selected from the group of aromatic amino acids and at least one being selected from the group of hydrophilic amino acids and with the number of hydrophilic amino acid residues constituting at least 50%, more preferably at least 75% of the total number of amino acid residues defining the antigenic domain. Hydrophilic amino acids are preferred as they are more likely to be exposed on the protein surface, thus resulting in increased accessibility to the antibody. See Hopp T. P. and Woods K. R., Proc. Natl. Acad. Sci., 78:3824-3828 (1981). Optionally, this sequence may include one or more non-aromatic, hydrophobic residues.
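The criteria of this preferred embodiment can be checked mechanically. The following Python sketch is illustrative only and is not part of the disclosed method: the function names are hypothetical, the residue classes are taken from the definitions of “hydrophilic” and “aromatic” given above, and “about six to about ten” is simplified to a strict range of 6 to 10 residues.

```python
# Residue classes from the "hydrophilic" and "aromatic" definitions above
# (one-letter codes). Tyrosine appears in both classes, as in the text.
HYDROPHILIC = set("KRHDESTCYNQ")
AROMATIC = set("FYW")


def check_antigenic_domain(domain: str) -> dict:
    """Report how one candidate antigenic domain fares on each criterion."""
    residues = domain.upper()
    n = len(residues)
    hydrophilic_count = sum(1 for r in residues if r in HYDROPHILIC)
    return {
        # Assumption: "about six to about ten" is treated as 6..10 inclusive.
        "length_ok": 6 <= n <= 10,
        "distinct_ok": len(set(residues)) >= 3,
        "has_aromatic": any(r in AROMATIC for r in residues),
        "has_hydrophilic": hydrophilic_count > 0,
        "hydrophilic_fraction": hydrophilic_count / n if n else 0.0,
    }


def meets_preferred_embodiment(domain: str, min_fraction: float = 0.5) -> bool:
    """True if every criterion holds; use min_fraction=0.75 for the stricter case."""
    result = check_antigenic_domain(domain)
    return (
        result["length_ok"]
        and result["distinct_ok"]
        and result["has_aromatic"]
        and result["has_hydrophilic"]
        and result["hydrophilic_fraction"] >= min_fraction
    )
```

For instance, `meets_preferred_embodiment("DYKDDDDK")` evaluates the FLAG® octapeptide (SEQ ID NO: 1), in which every residue belongs to the hydrophilic class as defined herein.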

In another embodiment of the present invention, the amino acids of each antigenic domain may be selected from charged or polar amino acid residues. Jin et al. have shown that the amino acid side chains of arginine, proline, glutamic acid, aspartic acid, phenylalanine and isoleucine play a dominant role in the functional epitope of human growth hormone (hGH). See Jin et al., J. Mol. Biol. 116:851-865 (1992) as incorporated herein by reference. Additionally, Jin et al. have shown that binding of the epitope to monoclonal antibodies is dominated by a small number of amino acid side chains in the epitope, which are often charged or polar amino acid side chains. See Jin et al., supra. Accordingly, designing the antigenic domain using amino acid residues selected from arginine, proline, glutamic acid, aspartic acid, phenylalanine and isoleucine may increase the surface accessibility of the identification polypeptide. See Benjamin et al., Annu. Rev. Immunol. 2:101 (1984); Novotny et al., Proc. Nat. Acad. Sci., 83:226-230 (1986); Alzari et al., Annu. Rev. Immunol. 6:555-580 (1988); Davies et al., Annu. Rev. Biochem., 59:439-473 (1990).

The identification polypeptide includes a cleavable linking sequence to link the sequence of antigenic domains to the target peptide. In general, the amino acid residues comprising the linking sequence may comprise any amino acid sequence which would serve to connect the sequence of antigenic domains to the target peptide. Furthermore, the linking sequence contains a cleavage site which comprises a unique amino acid sequence cleavable by use of a sequence-specific proteolytic agent. Once the hybrid polypeptide composed of the identification polypeptide and the target peptide has been purified from the culture extract, the identification polypeptide is preferably cleaved from the target peptide by digestion with a proteolytic agent specific for the amino acids of the cleavage site. Alternatively, the identification polypeptide may be removed from the target peptide by chemical cleavage using methods known to the art.

In general, the cleavable site may be located at the amino or carboxy terminus of the target peptide. Preferably, the cleavable site is immediately adjacent the target peptide to enable separation of the target peptide from the identification polypeptide. This cleavable site preferably does not appear in, and is not interposed between, the antigenic domains or, if present, the spacer domains of the identification polypeptide. In a preferred embodiment, the cleavable site is located at the amino terminus of the target peptide. If the cleavable site is located at the amino terminus of the target peptide and if there are remaining extraneous amino acids on the target peptide after cleavage with the proteolytic agent, an endopeptidase such as trypsin, clostripain or furin may be utilized to remove these remaining amino acids, thus resulting in a highly purified target peptide.

Digestion with a proteolytic agent may occur while the hybrid polypeptide is still bound to the affinity resin or, alternatively, the hybrid polypeptide may be eluted from the affinity resin and then digested with the proteolytic agent in order to further purify the target peptide. The efficiency of the proteolytic agent or the chemical cleavage of the recombinant target peptide is determined by the amino acid sequence of the linking sequence interposed between the sequence of antigenic domains and the target peptide.

Ideally, the amino acid sequence of the cleavage site is unique, thus minimizing the possibility that the proteolytic agent will cleave the target peptide. In a preferred embodiment, the cleavable site comprises amino acids for an enterokinase, thrombin or Factor Xa cleavage site.

Enterokinase recognizes several sequences: Asp-Lys; Asp-Asp-Lys; Asp-Asp-Asp-Lys (SEQ ID NO: 2); and Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 3). See Matsushima et al., J. Biochem 125:947-51 (1999). The only known natural occurrence of Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 3) is in the protein trypsinogen, which is a natural substrate for bovine enterokinase, and some yeast proteins. As such, by interposing a linking sequence containing the amino acid sequence Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 3) as a cleavable site between the sequence of antigenic domains and the amino terminus of the target peptide, the target peptide can be liberated from the identification polypeptide by use of bovine enterokinase with very little likelihood that this enzyme will cleave any portion of the target peptide itself.
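The liberation of the target peptide at the Asp-Asp-Asp-Asp-Lys site can be modeled in silico. The following Python sketch is illustrative only: the function names and the example target sequence are hypothetical, and it assumes that cleavage occurs on the carboxy-terminal side of the lysine of the site, which is the generally reported behavior of enterokinase.

```python
ENTEROKINASE_SITE = "DDDDK"  # Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 3)


def count_sites(seq: str, site: str = ENTEROKINASE_SITE) -> int:
    """Count occurrences of the site, including overlapping ones."""
    return sum(
        1 for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)
    )


def cleave_after_site(fusion: str, site: str = ENTEROKINASE_SITE) -> tuple:
    """Split a fusion sequence after the first occurrence of the site.

    Returns (identification_part, target_part). Assumes cleavage on the
    carboxy-terminal side of the final lysine of the site.
    """
    idx = fusion.find(site)
    if idx == -1:
        raise ValueError("cleavable site not found in fusion sequence")
    cut = idx + len(site)
    return fusion[:cut], fusion[cut:]


# Hypothetical example: a single FLAG octapeptide fused to a made-up target.
tag, target = cleave_after_site("DYKDDDDK" + "MSTPAR")
print(tag, target)
```

Running `count_sites` on a candidate target peptide before designing the construct gives a quick indication of whether the site is in fact unique to the linking sequence.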

Thrombin cleaves on the carboxy-terminal side of arginine in the following sequence: Leu-Val-Pro-Arg-Gly-X (SEQ ID NO: 4), where X is a non-acidic amino acid. See Chang, Eur. J. Biochem., 151:217 (1985). Factor Xa protease (i.e., the activated form of Factor X) cleaves after the Arg in the following sequences: Ile-Glu-Gly-Arg-X (SEQ ID NO: 5), Ile-Asp-Gly-Arg-X (SEQ ID NO: 6), and Ala-Glu-Gly-Arg-X (SEQ ID NO: 7), where X is any amino acid except proline or arginine. A fusion protein comprising the 31 amino-terminal residues of the cII protein, a Factor Xa cleavage site and human β-globin was shown to be cleaved by Factor Xa and generate authentic β-globin. See Nagai, K. and Thogersen, H. C., Nature, 308: 810-812 (1984). A limitation of the Factor Xa-based fusion systems is the fact that Factor Xa has been reported to cleave at arginine residues that are not present within the Factor Xa recognition sequence. See Nagai et al., Prot. Expr. and Purif., 2:372 (1991).
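The thrombin and Factor Xa recognition sequences described above can be expressed as regular expressions for screening a candidate target peptide. The following Python sketch is illustrative only: the dictionary and function names are hypothetical, and the scan covers only the canonical sequences listed above, so it will not predict the off-site Factor Xa cleavage noted in the preceding paragraph.

```python
import re

# Each pattern matches through the Arg; the lookahead checks the residues
# that must follow without consuming them, so match.end() is the cut position.
PROTEASE_PATTERNS = {
    # Leu-Val-Pro-Arg-Gly-X, X non-acidic (not Asp or Glu)
    "thrombin": re.compile(r"LVPR(?=G[^DE])"),
    # Ile-Glu-Gly-Arg-X, Ile-Asp-Gly-Arg-X, Ala-Glu-Gly-Arg-X; X not Pro or Arg
    "factor_xa": re.compile(r"(?:IEGR|IDGR|AEGR)(?=[^PR])"),
}


def find_cleavage_positions(seq: str, protease: str) -> list:
    """Return 0-based indices of the residue that follows each cleaved Arg."""
    pattern = PROTEASE_PATTERNS[protease]
    return [m.end() for m in pattern.finditer(seq.upper())]


def is_site_free(target: str, protease: str) -> bool:
    """True if the target peptide contains no canonical site for the protease."""
    return not find_cleavage_positions(target, protease)
```

A target for which `is_site_free` returns `False` for a given protease would suggest choosing a different cleavable site for the linking sequence.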

While less preferred, other unique amino acid sequences for other cleavable sites may also be employed in the linking sequence without departing from the spirit or scope of the present invention. For instance, the linking sequence can be composed in part of a pair of basic amino acids, i.e., Lys, Arg or His. This sequence is cleaved by kallikrein, a glandular enzyme. Also, the linking portion can be in part composed of Arg-Gly, since it is known that the enzyme thrombin will cleave after the Arg if this residue is followed by Gly. Further, it is not required that the antigenic domains and the cleavable site be exclusive of one another. Antibodies may be able to bind to amino acids found in the cleavable site, as is the case with the FLAG® monoclonal antibody M2, which recognizes part of the cleavable site Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 3) for enterokinase.

While it is generally preferred that each antigenic domain be immediately adjacent to another antigenic domain (i.e., no intervening sequences), the antigenic domains may be separated from one another by a spacer domain. A spacer domain may also be inserted between the multiple copies of the antigenic domain and the linking sequence. The insertion of a spacer domain preferably does not result in the insertion of a second copy of the cleavable site between the antigenic domains of the identification polypeptide. It is preferred that the number of amino acid residues in each spacer domain be minimal, preferably consisting of no more than ten amino acid residues, more preferably no more than about six amino acid residues, and still more preferably two or even one amino acid residue(s) in length.

If a spacer domain is employed, it may be designed to impart one or more desired properties to the identification polypeptide. In one embodiment, the amino acid(s) of the spacer domain are selected from among hydrophilic amino acids to increase the hydrophilic character of the identification polypeptide. Alternatively, the amino acid(s) of the spacer domain may be selected to impart a desired folding to the identification polypeptide, thereby increasing accessibility to the antibody; for example, the spacer domain may comprise glycine residues which result in a protein folding conformation which allows for improved accessibility to the antibody. See Dan et al., J. Biol. Chem. 271:30717-30724 (1996); Borjigin, J. and Nathans, J., J. Biol. Chem. 269:14715-14722 (1994).

It is well known in the art that certain amino acid residues such as histidine have an affinity to bind or chelate immobilized metal ions. Accordingly, designing an identification polypeptide with a metal chelating sequence composed of multiple or alternating histidine residues in the spacer domain or flanking either side of the sequence of antigenic domains would allow the hybrid polypeptide to bind to a metal ion immobilized on a resin or other matrix. In a preferred embodiment, a metal chelating sequence flanking the multiple copies of the antigenic domain or in a spacer domain may comprise at least one histidine residue, at least one glycine residue or a combination of alternating or multiple histidine residues of the formula: -(His-X)_(m)-, wherein m is 1 to 6 and X is selected from the group consisting of Gly, His, Tyr, Trp, Val, Leu, Ser, Lys, Phe, Met, Ala, Glu, Ile, Thr, Asp, Asn, Gln, Arg, Cys, and Pro, which may be used in affinity purification techniques using a Ni²⁺ binding metal resin. See, for example, U.S. Pat. Nos. 4,569,794, 5,310,663, 5,284,933 and 5,594,115 which are incorporated herein by reference. Preferably, the amino acids of the spacer domain do not include a second copy of the cleavable site as described herein. Once the hybrid polypeptide is bound to the metal resin, the hybrid polypeptide can be released by protonation of its associated metal ion-binding ligand. Dissociation is achieved by lowering the pH of the surrounding buffer medium, a common method known in the art for eluting bound proteins.
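The formula -(His-X)_(m)- can be enumerated or validated programmatically. The following Python sketch is illustrative only: the function names are hypothetical, the permitted X residues are those of the group recited above in one-letter code, and for simplicity a single X is used for every repeat.

```python
# Permitted X residues from the group recited above, in one-letter code:
# Gly, His, Tyr, Trp, Val, Leu, Ser, Lys, Phe, Met, Ala, Glu, Ile, Thr,
# Asp, Asn, Gln, Arg, Cys, Pro.
ALLOWED_X = set("GHYWVLSKFMAEITDNQRCP")


def his_x_repeat(x: str, m: int) -> str:
    """Build -(His-X)m- with the same X in every repeat (a simplification)."""
    if not 1 <= m <= 6:
        raise ValueError("m must be from 1 to 6")
    if x not in ALLOWED_X:
        raise ValueError("X is not in the permitted group")
    return ("H" + x) * m


def is_his_x_repeat(seq: str) -> bool:
    """True if seq has the form (His-X)m with m from 1 to 6 and any allowed X."""
    if len(seq) % 2 != 0 or not 2 <= len(seq) <= 12:
        return False
    return all(
        seq[i] == "H" and seq[i + 1] in ALLOWED_X for i in range(0, len(seq), 2)
    )
```

For example, `his_x_repeat("G", 3)` yields the alternating sequence His-Gly-His-Gly-His-Gly, and `his_x_repeat("H", 3)` yields a run of six histidines.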

In one embodiment of the present invention, the identification polypeptide comprises multiple copies of an antigenic domain generally corresponding to the FLAG® peptide sequence joined to a linking sequence containing a single enterokinase cleavage site. Such identification polypeptide generally corresponds to the sequence:

X²⁰-(X¹-Y-K-X²-X³-D-X⁴)ₙ-X⁵-(X¹-Y-K-X⁷-X⁸-D-X⁹-K)-X²¹

where:

-   D, Y and K are their representative amino acids;
-   X²⁰ and X²¹ are independently a hydrogen or a bond;
-   each X¹ and X⁴ is independently a bond or at least one amino acid residue, if other than a bond, preferably at least one amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably at least one hydrophilic amino acid residue, and still more preferably at least one aspartate residue;
-   each X², X³, X⁷ and X⁸ is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;
-   X⁵ is a bond or a spacer domain comprising at least one amino acid, if other than a bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)ₘ-, wherein m is 1 to 6 and X is selected from the group consisting of Gly, His, Tyr, Trp, Val, Leu, Ser, Lys, Phe, Ala, Glu, Ile, Thr, Asp, Asn, Gln, Arg, Cys, and Pro;
-   X⁹ is a bond or D; and
-   n is at least 2.

In this embodiment, the amino acid sequence X²⁰-(X¹-Y-K-X²-X³-D-X⁴)ₙ represents the multiple copies of the antigenic domain -X¹-Y-K-X²-X³-D- joined in tandem, which are joined to a linking sequence (X¹-Y-K-X⁷-X⁸-D-X⁹-K). The antigenic domains may be immediately adjacent to each other when X⁴ is a bond; optionally, X⁴ may be a spacer domain interposed between the multiple copies of antigenic domains. The linking sequence contains a single enterokinase cleavable site which is represented by the sequence -X⁷-X⁸-D-X⁹-K, where X⁷ and X⁸ may be an amino acid residue or a bond and X⁹ is a bond or an aspartate residue. In a preferred embodiment, each X⁷, X⁸ and X⁹ is independently an aspartate residue, thus resulting in the enterokinase cleavable site DDDDK (SEQ ID NO: 3), which is preferably located immediately adjacent to the amino terminus of the target peptide. The multiple copies of antigenic domains may be immediately adjacent to the linking sequence when X⁵ is a bond; optionally, X⁵ may be a spacer domain interposed between the linking sequence and the antigenic domains. When each X⁴ and X⁵ is independently a spacer domain, it is preferred that the amino acid residue(s) of each X⁴ and X⁵ impart one or more desired properties to the identification polypeptide; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the identification polypeptide, thereby increasing accessibility to the antibody. In another preferred embodiment, the amino acids of the spacer domains X⁴ and X⁵ may be selected to impart a desired affinity characteristic, such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix. Furthermore, these desired properties may be designed into other areas of the identification polypeptide; for example, the amino acids represented by X² and X³ may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.

In a more preferred embodiment, the identification polypeptide comprises multiple copies of an antigenic domain, a linking sequence containing a single enterokinase cleavage site and generally corresponds to the sequence:

X²⁰-(D-Y-K-X²-X³-D)ₙ-X⁵-(D-Y-K-X⁷-X⁸-D-X⁹-K)-X²¹

where:

-   D, Y and K are their representative amino acids;
-   X²⁰ and X²¹ are independently a hydrogen or a bond;
-   each X², X³, X⁷ and X⁸ is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;
-   X⁵ is a bond or a spacer domain comprising at least one amino acid, if other than a bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)ₘ-, wherein m is 1 to 6 and X is selected from the group consisting of Gly, His, Tyr, Trp, Val, Leu, Ser, Lys, Phe, Met, Ala, Glu, Ile, Thr, Asp, Asn, Gln, Arg, Cys, and Pro;
-   X⁹ is a bond or an aspartate residue; and
-   n is at least 2.

In this embodiment, the amino acid sequence X²⁰-(D-Y-K-X²-X³-D)ₙ represents the multiple copies of the antigenic domain D-Y-K-X²-X³-D in tandem, which are joined to a linking sequence (D-Y-K-X⁷-X⁸-D-X⁹-K). In this embodiment, one antigenic domain is immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain are immediately adjacent to the linking sequence when X⁵ is a bond. The linking sequence contains a single enterokinase cleavable site which is represented by the sequence -X⁷-X⁸-D-X⁹-K, where X⁷ and X⁸ may be a bond or an amino acid residue, preferably an aspartate residue, and X⁹ is a bond or an aspartate residue. In a preferred embodiment, each X⁷, X⁸ and X⁹ is independently an aspartate residue, thus resulting in the enterokinase cleavable site DDDDK, which is preferably adjacent to the amino terminus of the target peptide. Optionally, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer X⁵ when X⁵ is at least one amino acid residue. When X⁵ is a spacer domain, it is preferred that the amino acid residue(s) of X⁵ impart one or more desired properties to the identification polypeptide; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the identification polypeptide, thereby increasing accessibility to the antibody. In another preferred embodiment, the amino acids of the spacer domain may be selected to impart a desired affinity characteristic, such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix. Furthermore, these desired properties may be designed into other areas of the identification polypeptide; for example, the amino acids represented by X² and X³ may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.
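As a minimal sketch of how the formula of this embodiment expands into a concrete sequence, the following assembles the tandem antigenic domains, the optional spacer X⁵ and the linking sequence in one-letter code, then confirms that the enterokinase site DDDDK occurs only once. The function names and default arguments are hypothetical; the defaults simply apply the stated preference of aspartate for X², X³, X⁷, X⁸ and X⁹, and the termini X²⁰ and X²¹ are left out because they are a hydrogen or a bond.

```python
# Illustrative sketch; function names and defaults are assumptions.

def identification_polypeptide(n, x2="D", x3="D", x5="",
                               x7="D", x8="D", x9="D"):
    """Expand (D-Y-K-X2-X3-D)n-X5-(D-Y-K-X7-X8-D-X9-K) to one-letter code.

    An empty string stands for "a bond" (no residue at that position).
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    antigenic_domain = "DYK" + x2 + x3 + "D"
    linking_sequence = "DYK" + x7 + x8 + "D" + x9 + "K"
    return antigenic_domain * n + x5 + linking_sequence


def count_occurrences(sequence, motif):
    """Count occurrences of motif in sequence, including overlapping ones."""
    last_start = len(sequence) - len(motif)
    return sum(sequence.startswith(motif, i) for i in range(last_start + 1))


if __name__ == "__main__":
    tag = identification_polypeptide(3)
    print(tag)
    # The cleavable site should appear once, in the linking sequence only.
    print(count_occurrences(tag, "DDDDK"))
    # Same tag with a His-Gly-His spacer between the domains and the linker.
    print(identification_polypeptide(3, x5="HGH"))
```

The single-site check matters because a second DDDDK elsewhere in the tag would give enterokinase an additional place to cut.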

When the identification polypeptide is located at the amino terminus of the target peptide, it is desirable to design the amino acid sequence of the identification polypeptide such that an initiator methionine is present. Accordingly, in a preferred embodiment of the present invention, the identification polypeptide comprises multiple copies of an antigenic domain, a linking sequence containing a single enterokinase cleavage site and generally corresponds to the sequence:

X²⁰-X¹⁰-(D-Y-K-X²-X³-D)ₙ-X⁵-(D-Y-K-X⁷-X⁸-D-X⁹-K)-X²¹

where:

-   D, Y and K are their representative amino acids;
-   X²⁰ and X²¹ are independently a hydrogen or a bond;
-   X¹⁰ is a bond or an amino acid, if other than a bond, preferably a methionine residue;
-   each X², X³, X⁷ and X⁸ is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably an aspartate residue;
-   X⁵ is a bond or a spacer domain comprising at least one amino acid, if other than a bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)ₘ-, wherein m is 1 to 6 and X is selected from the group consisting of Gly, His, Tyr, Trp, Val, Leu, Ser, Lys, Phe, Met, Ala, Glu, Ile, Thr, Asp, Asn, Gln, Arg, Cys, and Pro;
-   X⁹ is a bond or an aspartate residue; and
-   n is at least 2.

In this embodiment, the amino acid sequence X²⁰-(D-Y-K-X²-X³-D)ₙ represents the multiple copies of the antigenic domain D-Y-K-X²-X³-D in tandem, which are flanked by a linking sequence (D-Y-K-X⁷-X⁸-D-X⁹-K) and an initiator amino acid X¹⁰, preferably methionine. The antigenic domain D-Y-K-X²-X³-D with an initiator methionine is recognized by the M5 antibody. In this embodiment, one antigenic domain is immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain are immediately adjacent to the linking sequence when X⁵ is a bond. The linking sequence contains an enterokinase cleavable site which is represented by the amino acid sequence -X⁷-X⁸-D-X⁹-K, where X⁷ and X⁸ may be a bond or an amino acid residue, preferably an aspartate residue, and X⁹ is a bond or an aspartate residue. In a preferred embodiment, each X⁷, X⁸ and X⁹ is independently an aspartate residue, thus resulting in the enterokinase cleavable site DDDDK (SEQ ID NO: 3), which is preferably adjacent to the amino terminus of the target peptide. Optionally, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer domain X⁵ when X⁵ is at least one amino acid residue. When X⁵ is a spacer domain, it is preferred that the amino acid residue(s) of X⁵ impart one or more desired properties to the identification polypeptide; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the identification polypeptide, thereby increasing accessibility to the antibody. In another preferred embodiment, the amino acids of the spacer domain may be selected to impart a desired affinity characteristic, such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix. Furthermore, these desired properties may be designed into other areas of the identification polypeptide; for example, the amino acids represented by X² and X³ may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.

In another embodiment of the present invention, the identification polypeptide comprises multiple copies of an antigenic sequence, a linking sequence containing a single enterokinase cleavable site and generally corresponds to the sequence:

X²⁰-(D-X¹¹-Y-X¹²-X¹³)ₙ-X¹⁴-(D-X¹¹-Y-X¹²-X¹³-D-X¹⁵-K)-X²¹

where:

-   D, Y and K are their representative amino acids;
-   X²⁰ and X²¹ are independently a hydrogen or a bond;
-   each X¹¹ is a bond or an amino acid, preferably
-   each X¹² is an amino acid, preferably selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;
-   each X¹³ is a bond or at least one amino acid, if other than a bond, preferably selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;
-   X¹⁴ is a bond or a spacer domain comprising at least one amino acid, if other than a bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)ₘ-, wherein m is 1 to 6 and X is selected from the group consisting of Gly, His, Tyr, Trp, Val, Leu, Ser, Lys, Phe, Met, Ala, Glu, Ile, Thr, Asp, Asn, Gln, Arg, Cys, and Pro;
-   X¹⁵ is a bond or an aspartate residue; and
-   n is at least 2.

In this embodiment, the amino acid sequence X²⁰-(D-X¹¹-Y-X¹²-X¹³)ₙ represents the multiple copies of the antigenic domain D-X¹¹-Y-X¹²-X¹³ in tandem, which are joined to a linking sequence (D-X¹¹-Y-X¹²-X¹³-D-X¹⁵-K). Additionally, one antigenic domain is immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain are immediately adjacent to the linking sequence when X¹⁴ is a bond. The linking sequence contains a single enterokinase cleavable site which is represented by the sequence -X¹²-X¹³-D-X¹⁵-K, where X¹² and X¹³ may be a bond or an amino acid residue, preferably an aspartate residue, and X¹⁵ is a bond or an aspartate residue. In a preferred embodiment, each X¹², X¹³ and X¹⁵ is independently an aspartate residue, thus resulting in the enterokinase cleavable site DDDDK (SEQ ID NO: 3), which is preferably adjacent to the amino terminus of the target peptide. Optionally, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer X¹⁴ when X¹⁴ is at least one amino acid residue. When X¹⁴ is a spacer domain, it is preferred that the amino acid residue(s) of X¹⁴ impart one or more desired properties to the identification polypeptide; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the identification polypeptide, thereby increasing accessibility to the antibody. In another preferred embodiment, the amino acids of the spacer domain X¹⁴ may be selected to impart a desired affinity characteristic, such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix.

Target Peptide

In accordance with the present invention, the target peptide may be composed of any proteinaceous substance that can be expressed in transformed host cells. Accordingly, the present invention may be beneficially employed to produce substantially any prokaryotic or eukaryotic, simple or conjugated, protein that can be expressed by a vector in a transformed host cell. Such proteins include enzymes, whether oxidoreductases, transferases, hydrolases, lyases, isomerases or ligases.

The present invention also contemplates the production of storage proteins, such as ferritin or ovalbumin, or transport proteins, such as hemoglobin, serum albumin or ceruloplasmin. Also included are the types of proteins that function in contractile and motile systems, for instance, actin and myosin.

The present invention also contemplates the production of proteins that serve a protective or defense function, such as the blood protein fibrinogen. Other protective proteins include the binding proteins, such as antibodies or immunoglobulins that bind to and thus neutralize antigens.

The protein produced by the present invention also may encompass various hormones such as Human Growth Hormone, somatostatin, prolactin, estrone, progesterone, melanocyte, thyrotropin, calcitonin, gonadotropin and insulin. Other such hormones include those that have been identified as being involved in the immune system, such as interleukin 1, interleukin 2, colony stimulating factor, macrophage-activating factor and interferon.

The present invention is also applicable to the production of toxic proteins, such as ricin from castor bean or gossypin from cotton linseed.

Proteins that serve as structural elements may be produced by the present invention; such proteins include the fibrous proteins collagen, elastin and alpha-keratin. Other structural proteins include glyco-proteins, virus-proteins and muco-proteins.

In addition to the above-noted naturally occurring proteins, the present invention may be employed to produce synthetic proteins, defined generally as any sequence of amino acids not occurring in nature.

Genes coding for the various types of protein molecules identified above may be obtained from a variety of prokaryotic or eukaryotic sources, such as plant or animal cells or bacterial cells. The genes can be isolated from the chromosome material of these cells or from plasmids of prokaryotic cells by employing standard, well-known techniques. A variety of naturally occurring and synthesized plasmids having genes coding for many different protein molecules are now commercially available from a variety of sources. The desired DNA also can be produced from mRNA by using the enzyme reverse transcriptase. This enzyme permits the synthesis of DNA from an RNA template.

Preparation of DNA Expression Vectors

In accordance with the present invention, once a gene coding for a target peptide is isolated, synthesized or otherwise obtained, it is joined to a synthetic DNA fragment coding for the identification polypeptide.

The identification polypeptide gene may be synthesized by well-known techniques. For a chosen composition of the identification polypeptide, DNA oligomers encoding the desired amino acids of the identification polypeptide may be synthesized using a commercially available, automated DNA synthesizer in a manner well known in the art. The techniques and apparatus for synthesizing DNA are common and known in the art; thus, the description and detail to perform this will not be completely set forth herein. Essentially, this process involves obtaining pairs of synthetic oligonucleotides and digesting them with the appropriate restriction endonucleases. This will produce the correct nucleotide sequence encoding the identification polypeptide tag. After digestion, various DNA fragments are formed with cohesive or “sticky ends.” Although there may be many ways in which to perform such construction, the preferred embodiment involves the generation of multiple FLAG® epitope sequences or variations thereof in tandem.

The pair of oligonucleotides used in the construction of the identification polypeptide may be naturally occurring or synthetically generated. It is generally preferred that the specific pairs of oligonucleotides have been synthetically generated to produce the amino acid sequence of the desired identification polypeptide tag. The strands of each oligonucleotide are annealed together and digested with an appropriate restriction endonuclease such as EcoR I and Hind III. After digestion and the creation of the nucleotide cassettes, the sequences can be verified through DNA sequencing.

As discussed below, the synthetic DNA oligomers encoding the identification polypeptide may be ligated to a DNA sequence encoding the desired protein, and the combined DNA fragments then ligated to an appropriate expression vector to form a cloning vehicle for transformation into an appropriate host cell.

In addition to the target peptide gene and the identification polypeptide gene, if needed, the hybrid DNA fragment may include a ribosome binding site for high level protein translation in a host cell, a translation initiation codon (ATG), and a promoter.

Generally, the genes coding for the target peptide and the identification polypeptide ideally are treated with an appropriate restriction enzyme or are otherwise manipulated to have cohesive termini to facilitate ligation with each other and with a plasmid or other type of cloning vector. The cloning vector is preferably digested with the same restriction endonuclease used to condition the foreign genes in order to form complementary cohesive termini (i.e., “sticky ends”) prior to ligation with the foreign genes. Alternatively, the use of certain restriction enzymes (e.g., Pvu II, Bal I) may result in the formation of termini without complementary overhanging sequences, commonly referred to as “square” or “blunt ends.” The square ends of the plasmid can be joined to the foreign genes with an appropriate ligase. Additionally, various techniques may be used to manipulate the nucleic acids of the blunt ends to form cohesive termini; for instance, linker molecules may be used to add nucleotide bases or appropriate enzymes may be used to remove nucleotide bases from the flush ends. Methods and materials for achieving this are well known in the art.

PCR is also an effective tool for cloning known genes (into blunt or sticky sites). Primers can code for 25-40 bases of known sequence and the resulting PCR product can be cloned into a digested vector having blunt ends by removing any possible 3′ overhangs with T4 DNA polymerase. Another method of linking sequences with the use of the PCR reaction is to create restriction sites at the end(s) of the amplified DNA. These restriction sites are easily added to the 5′ ends of the primers used for amplification. Digestion of the purified PCR products will produce ends for ligation to other DNA having compatible termini.
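The primer design described above, in which restriction sites are added to the 5′ ends of the amplification primers, can be sketched as follows. This is a simplified illustration, not a validated primer design tool: the function names, the two extra 5′ "clamp" bases and the default annealing length of 25 bases (the lower end of the 25-40 base range mentioned above) are assumptions made for the example, and melting temperature and secondary structure are ignored.

```python
# Illustrative sketch; names, clamp bases and defaults are assumptions.

# Recognition sequences for the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def primers_with_sites(template, fwd_site, rev_site,
                       anneal_len=25, clamp="GC"):
    """Build forward and reverse primers carrying 5' restriction sites.

    template is the coding strand (5'->3') of the region to amplify.
    Each primer is: clamp + restriction site + annealing region.
    """
    if len(template) < anneal_len:
        raise ValueError("template is shorter than the annealing region")
    forward = clamp + SITES[fwd_site] + template[:anneal_len]
    # The reverse primer anneals to the coding strand's 3' end, so its
    # annealing region is the reverse complement of the last bases.
    reverse = clamp + SITES[rev_site] + reverse_complement(template[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical 63-base template used only to exercise the function.
    demo = ("ATGACCGGTTCACGTCTAGGCATCAGTCCAGTACGG"
            "ATTACGCTAGGCTTAACGGTCAAGTGA")
    fwd, rev = primers_with_sites(demo, "EcoRI", "HindIII")
    print(fwd)
    print(rev)
```

Using two different sites on the two primers, as here, lets the digested product ligate into the vector in only one orientation.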

It is to be appreciated that digestion of the chosen plasmid with a restriction endonuclease(s) may result in the formation of two or more linear DNA segments. The segment to be used to form the cloning vector, i.e., the segment having the phenotypic identity gene, replicon and the other desired components, may be identified by well-known techniques, such as by gel electrophoresis.

The resulting cloning vector is used to transform a host microorganism. The transformants are isolated and analyzed for the presence of the foreign genes and for the proper orientation of the genes within the vector. The transformants are then multiplied in culture to cause replication of the vector and high level expression of the hybrid polypeptide being sought. In addition, the cloning vectors may be used to transform other strains of the chosen host or other types of hosts for large scale production of the hybrid heterologous polypeptide. Various procedures and materials for preparing recombinant vectors, transforming host cells with the vectors, replicating the vector and expressing polypeptides and proteins are discussed by Old and Primrose, Principles of Gene Manipulation, (2d Ed. 1981).

To carry out the present invention, various cloning vectors may be utilized. Although a plasmid is preferred, the vector may be a bacteriophage or cosmid. If cloning takes place in mammalian or plant cells, viruses can be used as vectors. If a plasmid is employed, it may be obtained from a natural source or artificially synthesized. The particular plasmid chosen should be compatible with the particular cells serving as the host, whether a bacterium such as Escherichia coli (E. coli), yeast, or other unicellular microorganism. The plasmid should have the proper origin of replication (replicon) for the particular host cell chosen.

In addition, the size of the plasmid must be sufficient to accommodate the hybrid genes coding for both the target peptide and the identification polypeptide, but the plasmid should also be of as low a molecular weight as possible. Low molecular weight plasmids are more resistant to damage from shearing and are more readily isolated from host cells. If obtained from natural sources, they are usually present as multiple copies, thereby facilitating their isolation. Also, there is less likelihood that a low molecular weight plasmid has multiple substrate sites for restriction endonucleases.

Another requirement for a plasmid cloning vector is the presence of restriction sites so that appropriate restriction enzymes can cleave the plasmid for subsequent ligation with the foreign genes without causing inactivation of the replicon. To this end, it would be helpful for the plasmid to have single substrate sites for a large number of restriction endonucleases.
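The preference for single substrate sites can be checked computationally once a plasmid sequence is known. The sketch below counts recognition sites on a circular sequence and reports the enzymes that cut exactly once. The function names and the toy sequence are hypothetical; the recognition sequences are the standard ones for these four enzymes, and because all four are palindromic, scanning one strand is sufficient.

```python
# Illustrative sketch; function names and the toy plasmid are assumptions.

# Recognition sequences (5'->3') for a few six-base cutters; all are
# palindromic, so one strand reports every site.
ENZYMES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "PvuII": "CAGCTG",
}


def count_sites(plasmid, site):
    """Count recognition sites on a circular plasmid sequence."""
    plasmid = plasmid.upper()
    # Append the first bases so sites spanning the origin are found.
    circular = plasmid + plasmid[: len(site) - 1]
    return sum(circular.startswith(site, i) for i in range(len(plasmid)))


def single_cutters(plasmid, enzymes=ENZYMES):
    """Return the names of enzymes that cut the plasmid exactly once."""
    return sorted(name for name, site in enzymes.items()
                  if count_sites(plasmid, site) == 1)


if __name__ == "__main__":
    toy_plasmid = "TTGAATTCAAGGATCCTTGGATCCAACT"  # made-up sequence
    for name, site in ENZYMES.items():
        print(name, count_sites(toy_plasmid, site))
    print(single_cutters(toy_plasmid))
```

An enzyme that cuts once linearizes the plasmid for insertion; one that cuts twice or more would release a fragment and could disrupt the replicon or the selection gene.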

As stated above, there may be intervening amino acid spacer domains between the multiple antigenic domains of the identification polypeptide. By varying the triplet DNA sequence representing specific amino acids (i.e., codons) in the design of these spacer domains, it is possible to create multiple restriction enzyme sites for enzymes that recognize and cleave those designed sequences without changing the amino acid sequence of the encoded identification polypeptide. The use of sequences encoding recognition sites for restriction enzymes having a minimum of 6 bases in the recognition site is preferred, thus reducing the chance that multiple restriction sites will be present in both the DNA vector and the DNA sequences encoding the target peptide.
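The idea of varying codons to create a restriction site without changing the encoded amino acids can be sketched by brute force for a short spacer. The code below enumerates every synonymous DNA encoding of a peptide under the standard genetic code and keeps those containing a given recognition sequence. The function names are hypothetical, and the Gly-Ser and Asp-Gly-Ser peptides are illustrative spacers chosen for the example rather than sequences taken from the text; exhaustive enumeration is practical only for peptides of a few residues.

```python
# Illustrative sketch; names and example peptides are assumptions.
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Map each amino acid to the list of codons that encode it.
SYNONYMS = {}
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS.setdefault(amino_acid, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodings_with_site(peptide, site):
    """List synonymous encodings of peptide that contain site."""
    hits = []
    for codons in product(*(SYNONYMS[aa] for aa in peptide)):
        dna = "".join(codons)
        if site in dna:  # the site may fall in any reading frame
            hits.append(dna)
    return sorted(hits)


if __name__ == "__main__":
    # A Gly-Ser spacer can be written so that it carries a BamHI site.
    print(encodings_with_site("GS", "GGATCC"))
    print(encodings_with_site("DGS", "GGATCC"))
```

Every encoding returned translates back to the same peptide, so the restriction site is added at the DNA level only.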

Likewise, a linking sequence is used to join the DNA sequences encoding the target peptide to the DNA sequences encoding the multiple antigenic domains of the identification polypeptide. By varying the triplet DNA sequence representing specific amino acids in the design of the linking sequence, it is possible to create restriction sites for enzymes that recognize and cleave those designed sequences without changing the amino acid sequence of the encoded identification polypeptide. The use of sequences encoding recognition sites for restriction enzymes having a minimum of 6 bases in the recognition site is preferred; this reduces the chance that multiple restriction enzyme cleavable sites will be present in both the vector and the sequences encoding the target peptide.

Moreover, the plasmid should have a phenotypic property that will enable the transformed host cells to be readily identified and separated from cells which do not undergo transformation. Such phenotypic selection genes can include genes providing resistance to a growth inhibiting substance, such as an antibiotic. Plasmids are now widely available that include genes conferring resistance to various antibiotics, such as tetracycline, streptomycin, sulfa drugs, penicillin, and ampicillin. When host cells are grown in a medium containing one of these antibiotics, only transformants having the appropriate antibiotic resistance gene will survive.

Rather than utilizing a gene conferring resistance to a growth inhibiting compound to identify transformed host cells, phenotypic selection genes can also include those that provide a growth factor to permit transformed cells to propagate in a medium which lacks the necessary growth factor for the host cells. For instance, for yeast auxotrophs, such growth factors include tryptophan or leucine.

Alternatively, it is preferred that a DNA sequence encoding a signal peptide be joined to the sequences encoding the identification polypeptide and the target peptide. The use of a secreted signal sequence will also enable the transformed host cells to be readily identified and separated from cells which do not undergo transformation. Secretion signals are relatively short in most species, generally comprising 16-40 amino acids. Additionally, signal sequences from bacterial or eukaryotic genes are highly conserved in terms of function. Although the DNA sequences encoding these signal peptides are not highly conserved, many of these signal sequences have been shown to be interchangeable. See Grey, G. L. et al., Gene 39:247 (1985).

Transformation of the Recombinant Plasmid

Once a suitable DNA vector encoding the desired hybrid polypeptide has been constructed, the vector is introduced into the desired host cell. Although the host cell may be any appropriate prokaryotic or eukaryotic cell, preferably it is a well-defined bacterium, such as E. coli, or a yeast strain. Both such hosts are readily transformed and capable of rapid growth in fermentation cultures. In place of E. coli, other unicellular microorganisms can be employed, for instance fungi and algae. In addition, other forms of bacteria such as salmonella or pneumococcus may be substituted for E. coli. Whatever host is chosen, it should be one that does not contain a restriction enzyme that would cleave the recombinant plasmid and that has the necessary biochemical pathways for phenotypic expression and other functions for proper expression of the hybrid polypeptide.

DNA molecules are transfected into prokaryotic and eukaryotic hosts using standard protocols known in the art. Briefly, the prokaryotic host cells are made competent by treatment with calcium chloride solutions (competent bacterial cells are commercially available and are easily made in the laboratory). This treatment permits the uptake of DNA by the bacterial cell. Another means of introducing DNA into bacterial cells is electroporation, in which an electrical pulse is used to permit the uptake of DNA by bacterial cells. Likewise, standard protocols such as calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, electroporation, microinjection, lipofection, protoplast fusion, retroviral infection and particle bombardment (e.g., biolistics) are commonly used for the introduction of DNA molecules into eukaryotic hosts, including yeast and higher eukaryotes.

In transformation protocols, only a small portion of the host cells is actually transformed, due to limited plasmid uptake by the cells. Thus, before transformants are isolated, the host cells used in the transformation protocol typically are multiplied in an appropriate medium. The cells that actually have been transformed can be identified by placing the original culture on agar plates containing a suitable growth medium containing the phenotypic identifier, such as an antibiotic. Only those cells that have the proper resistance gene will survive. Cells from the colonies that survive can be lysed and the plasmid then isolated from the lysate. The plasmid thus isolated can be characterized to determine if the cointegrate genes are ligated in the correct orientation, by digestion with restriction endonucleases and subsequent gel electrophoresis or by other standard methods. Once transformed cells are identified, they can be multiplied by established techniques, such as by fermentation. In addition, the recovered cloned recombinant plasmids can be used to transform other strains of bacteria or other types of host cells for large scale replication and expression of the hybrid polypeptide.

Purification of Hybrid Polypeptide

The hybrid polypeptide molecules expressed by the transformed host cells are separated from the culture medium, other cellular material, etc., preferably by an affinity chromatography process. To this end, antibodies against the antigenic domains of the identification polypeptide of the hybrid polypeptide must be generated for use on a column matrix. To produce such antibodies, the identification polypeptide is first synthesized and then used to immunize an appropriate animal for production of an antibody against the identification polypeptide. Such methods for producing antibodies are taught in U.S. Pat. No. 4,851,341, incorporated herein by reference. The antibody can be identified by an enzyme-linked immunosorbent assay (ELISA) or other appropriate assay. A monoclonal antibody can then be produced by hybridoma techniques. Preferred antibodies are the FLAG® monoclonal antibodies M1, M2 and M5. After purification, the antibody or antibodies are bound to the column matrix and then an extract from the transformed host cells is applied to the column to isolate the hybrid polypeptide. The hybrid polypeptide is eluted from the column, for instance, by competition from free identification polypeptide.

Additionally, if the identification polypeptide contains histidine, glycine or combinations of multiple or alternating histidine residues, Immobilized Metal Ion Affinity Chromatography (IMAC) may be used as an alternative method to isolate and purify target peptides. When a hybrid polypeptide containing the target peptide and the identification polypeptide is produced and passed through a column containing immobilized metal ions, the hybrid polypeptide will chelate immobilized metal ions. The hybrid polypeptide should chelate to the immobilized metal ions for a sufficient amount of time to allow it to be separated from other materials. Once the hybrid polypeptide is bound to the metal ion resin, the hybrid polypeptide may be released by protonation of its associated metal ion-binding ligand. Dissociation is achieved by lowering the pH of the surrounding buffer medium, a common method known in the art for eluting bound proteins. The target peptide may then be cleaved from the identification polypeptide as further discussed herein.
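The release by protonation described above can be made concrete with the Henderson-Hasselbalch relationship. The sketch below estimates what fraction of histidine side chains carries a proton at a given pH. It is a simplified illustration: the function name is hypothetical, the pKa of 6.0 is an assumed approximate value for the histidine imidazole group, and the actual pKa of a residue in a folded or metal-bound protein can differ.

```python
# Illustrative sketch; the function name and the pKa value are assumptions.

def fraction_protonated(ph, pka=6.0):
    """Fraction of a titratable group that is protonated at a given pH.

    Rearranged Henderson-Hasselbalch: fraction = 1 / (1 + 10**(pH - pKa)).
    The default pKa of 6.0 is an approximate value for the histidine
    side chain, used here only for illustration.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Compare a typical binding pH with progressively more acidic buffers.
    for ph in (8.0, 7.0, 6.0, 5.0, 4.0):
        print(f"pH {ph:.1f}: {fraction_protonated(ph):.3f} protonated")
```

Under these assumptions, nearly all histidine side chains are unprotonated and free to coordinate the metal ion at pH 8, while nearly all are protonated at pH 4, which is consistent with elution by lowering the buffer pH.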

Other methods may be used to detect, monitor or isolate target peptides. Such methods include immunoprecipitation and Western blotting as are described in “Principles and Practice of Immunoassay,” Price and Newman, eds., Stockton Press, 1991. The use of immunoprecipitation as a sensitive and specific technique to detect and quantitate target antigen in mixtures of proteins is known to one skilled in the art. See Molecular Cloning, A Laboratory Manual, 2d Edition, Maniatis, T. et al. eds. (1989) Cold Spring Harbor Press. Briefly, antibodies, preferably FLAG® monoclonal antibodies M1, M2 or M5, capable of binding to the antigenic domains of the identification polypeptide may be used to detect the proteins using immunoprecipitation tests. As described above, cells are transformed with the identification polypeptide, grown in culture media, and lysed to obtain a solution of tagged proteinaceous material produced by the cells. This solution is incubated with a solution of monoclonal antibodies, and any complex between the identification polypeptide-labeled protein formed in the cell and the antibodies is determined by precipitation. The protein/antibody complex can then be isolated from the precipitate. The presence of the labeled protein is then confirmed by usual analytical methods, e.g., SDS polyacrylamide gel electrophoresis with fluorography, under conditions dissociating the protein/antibody complex.

Additionally, Western blotting is another immunoassay technique used to detect the target peptide. Generally, small quantities of a target peptide are electrophoresed on a polyacrylamide gel and transferred (by blotting) to a polymer sheet or membrane. The membrane is then incubated with a first antibody, preferably a FLAG® monoclonal antibody, which may bind to the antigenic domains of the identification polypeptide. The membrane containing the antibody-antigen complex is then incubated with a second, labeled antibody specific for the first antibody. The protein tagged with the identification polypeptide may be detected and visualized by known methods such as autoradiography.

Separation of Mature Protein From Purified, Hybrid Identification Polypeptide/Protein Molecules

Unless removed while still bound to the affinity column or matrix, the identification polypeptide may be cleaved from the protein molecule and the protein molecule separated from the identification polypeptide, thereby resulting in a purified protein. This is accomplished by first suspending the hybrid identification polypeptide/protein molecules in buffer. Thereafter, the proteolytic enzyme or other chemical proteolytic agent that is specific for the amino acid residues composing the linking portion of the identification polypeptide is added to the suspension. The enzyme may be coupled to a gel matrix to prevent contamination of the product solution with the enzyme. As discussed above, the proteolytic enzyme or chemical proteolytic agent cleaves the hybrid polypeptide between the adjacent amino acid residues of the linking portion of the identification polypeptide and the protein molecule. As also noted above, as a nonlimiting example, the linking amino acids may be composed of the sequence: Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 3). This particular sequence of amino acids is only known to occur naturally in the protein trypsinogen, the substrate for bovine mucosal enterokinase. Thus, by use of this particular amino acid sequence it is highly unlikely that enzyme cleavage of the hybrid identification polypeptide/protein molecules would also cause cleavage of the protein molecule itself.
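The cleavage step described above can be pictured with a short sketch. It assumes that the cut falls immediately after the final Lys of the linking sequence, so that the target peptide is released with no tag residues attached. The function name `cleave_at_linker` and the target residues are hypothetical placeholders chosen for illustration and are not part of the disclosure.

```python
# Illustrative sketch only: split a hybrid polypeptide at its linking sequence.
LINKER = "DDDDK"  # Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 3) in one-letter code


def cleave_at_linker(hybrid, linker=LINKER):
    """Return (identification_polypeptide, target_peptide).

    Assumption: the proteolytic agent cuts directly after the last residue
    of the linker, and the linker occurs exactly once in the hybrid.
    """
    sites = [i for i in range(len(hybrid)) if hybrid.startswith(linker, i)]
    if len(sites) != 1:
        raise ValueError(f"expected exactly one linker site, found {len(sites)}")
    cut = sites[0] + len(linker)
    return hybrid[:cut], hybrid[cut:]


# The single FLAG tag (SEQ ID NO: 1) ends in the linking sequence.
tag = "DYKDDDDK"
target = "TPEMPVLENRAAQGDITAPGGARRL"  # hypothetical placeholder residues

tag_part, target_part = cleave_at_linker(tag + target)
print("identification polypeptide:", tag_part)
print("target peptide:            ", target_part)
```

The check for exactly one site mirrors the point made in the text: cleavage is clean only when the recognition sequence does not also occur inside the protein molecule itself.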

After incubation, the desired protein is purified as follows. If the proteolytic agent is an enzyme attached to a gel matrix, the suspension is centrifuged and the pellet (containing the enzyme-gel conjugate) is discarded. The supernatant contains only the protein product, the cleaved identification polypeptide and possibly small amounts of uncleaved peptide/protein molecule, in addition to buffer salts. In the case of chemical cleavage agents, there would be no gel centrifugation step, and the solution would contain a residual chemical agent and by-products of the chemical agent in addition to the protein product, identification polypeptide and small amounts of uncleaved peptide/protein molecule.

Most of the above-mentioned contaminating substances are much smaller than the protein product and can be efficiently removed by simple means, such as gel filtration or dialysis. Only the uncleaved identification polypeptide/protein molecule would remain to contaminate the protein product after such steps. To remove the polypeptide/protein molecule from the protein product, the mixture is passed over a second affinity column, which column has attached to it the same antibody specific for the identification polypeptide as was used for removal of the peptide/protein molecule from the original production medium. The antibody binds the unwanted polypeptide/protein molecule, and the eluate from the column contains only the desired product protein, now free of all contaminants.

If a soluble enzyme is used for proteolytic cleavage, then the protein product may contain small amounts of the enzyme, which can be removed by passing the solution over an affinity column containing an immobilized substrate for the enzyme. The enzyme is thereby bound to the column and the desired protein molecules are allowed to pass through.

As noted above, some protein products will possess the desired enzymatic activity with the identification polypeptide still attached thereto. As a consequence, the identification polypeptide need not be cleaved from the protein molecule, and thus the above-described cleavage and subsequent purification steps need not be performed.

Moreover, in situations in which the identification polypeptide remains attached to the protein molecule, the linking portion of the identification polypeptide is not needed. Instead, the identification polypeptide can be composed solely of the antigenic domains. In this situation the construction and method of preparing the DNA expression vectors, detailed above, can be appropriately modified.

The following examples are intended to illustrate but not limit the present invention.

EXAMPLES

Example 1 p3XFLAG-CMV-7 Construction

Materials and Methods

Construction of p3XFLAG-CMV-7

p3XFLAG-CMV-7 was constructed from the mammalian expression vector, pCMV-5. The triple FLAG sequence was constructed from two pairs of complementary oligonucleotides. The first pair of oligonucleotides was synthesized as follows:

(SEQ ID NO: 8) 5′ GAAGAATTCACCATGGACTACAAAGACCATGACGGTGATTATAAAGATCATGAT 3′ and

(SEQ ID NO: 9) 5′ ATCATGATCTTTATAATCACCGTCATGGTCTTTGTAGTCCATGGTGAATTCTTC 3′.

The second pair was synthesized with the following sequence:

(SEQ ID NO: 10) 5′ GAAGATATCGATTACAAGGATGACGATGACAAGCTTGGG 3′ and

(SEQ ID NO: 11) 5′ CCCAAGCTTGTCATCGTCATCCTTGTAATCGATATCTTC 3′.

The first pair of oligonucleotides was annealed together and digested with EcoR I. The second pair of oligonucleotides was annealed together and digested with EcoR V and Hind III. The two digested oligonucleotide cassettes were ligated into pCMV-5, which had been double digested with EcoR I and Hind III. The sequence was verified by DNA sequencing.
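As a consistency check on the oligonucleotides listed above, the following sketch confirms that each pair is an exact reverse complement and translates the assembled cassette using the standard genetic code. The assembly model is an assumption made for illustration: the blunt 3′ end of the first cassette is taken to join the blunt end left by EcoR V (which cuts GAT^ATC) in the second cassette, and translation is read from the ATG through the Lys codon that overlaps the Hind III site. The helper names are hypothetical.

```python
# Illustrative check of SEQ ID NOS: 8-11 (sequences copied from the text).
SEQ8 = "GAAGAATTCACCATGGACTACAAAGACCATGACGGTGATTATAAAGATCATGAT"
SEQ9 = "ATCATGATCTTTATAATCACCGTCATGGTCTTTGTAGTCCATGGTGAATTCTTC"
SEQ10 = "GAAGATATCGATTACAAGGATGACGATGACAAGCTTGGG"
SEQ11 = "CCCAAGCTTGTCATCGTCATCCTTGTAATCGATATCTTC"

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from the start of the string."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


pair1_ok = reverse_complement(SEQ8) == SEQ9
pair2_ok = reverse_complement(SEQ10) == SEQ11

# Assumed assembly: coding part of cassette 1 from its ATG, then cassette 2
# from the EcoR V cut (GAT^ATC) through the Lys codon in the Hind III site.
first = SEQ8[SEQ8.index("ATG"):]
second = SEQ10[SEQ10.index("GATATC") + 3:SEQ10.index("AAGCTT") + 3]
tag_protein = translate(first + second)

print("pair 1 complementary:", pair1_ok)
print("pair 2 complementary:", pair2_ok)
print("encoded tag:", tag_protein)
```

Under these assumptions the cassette encodes two Asp-Tyr-Lys-Asp-His-Asp units followed by one Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys unit, separated by Gly and Ile residues.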

p3XFLAG-CMV-7-BAP Construction

A modified version of the E. coli phoA gene, from which the leader sequence and the N-terminal four amino acids of the mature enzyme were deleted, was subcloned into the vector p3XFLAG-CMV-7. The modified sequence was cut from pFLAG-ATS-BAP by double digestion with Hind III and Bgl II. The fragment was then cloned into p3XFLAG-CMV-7, which had been double digested with Hind III and Bam HI, to generate p3XFLAG-CMV-7-BAP. The nucleotide sequence at the N-terminus of the phoA coding region was verified.

Triple FLAG-ATS-BAP Construction

Two oligonucleotides encoding the sense and anti-sense strands for the triple FLAG sequence were synthesized, 5′ phosphorylated with T4 polynucleotide kinase, and annealed together. pFLAG-ATS-BAP was digested with Nde I and Hind III and the vector purified by gel electrophoresis. The annealed cassette was ligated to the double digested pFLAG-ATS-BAP vector with T4 DNA ligase, and the reaction was carried out overnight (16 hours) at 16° C. The ligation was enriched by digestion with Nru I and then transformed into E. coli DH5α. Clones were isolated and verified by sequencing.

Results

Applicants have constructed a vector for expression of proteins in mammalian host cells using a modified version of the FLAG expression system, which contains 3XFLAG sequences in tandem (FIG. 1). This construct was designed to improve the detection limit of expressed proteins in mammalian host cells. The first two FLAG peptides are modified FLAG® sequences. The original FLAG® epitope is Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 1), while the first two FLAG recognition sequences are Asp-Tyr-Lys-Asp-His-Asp (SEQ ID NO: 12) with either a Gly or Ile spacer domain between the sequences. These alternative sequences arise from phage display studies in which a different binding motif was determined. See Miceli et al., J. Immunological Methods 167:279-287 (1994). This allows the introduction of additional FLAG® antibody binding sites without the addition of extra enterokinase recognition/cleavage sites.

The p3XFLAG-CMV-7 expression vector contains the human cytomegalovirus promoter region necessary for constitutive expression of cloned genes in many mammalian cell lines. The Kozak consensus sequence is provided in the vector along with a multiple cloning site, which allows for a variety of cloning strategies. The multiple cloning site is compatible with the other existing CMV mammalian expression vectors. In addition, the expression vector contains the SV40 origin of replication for efficient high-level transient expression and a DNA segment from the human growth hormone gene containing transcriptional termination sequence and polyadenylation signals. p3XFLAG-CMV-7 contains the β-lactamase gene for selection of the plasmid in E. coli.

Example 2 Bacterial Expression of p3XFLAG-ATS-BAP and Protein Purification

Materials and Methods

E. coli BL21 (DE3) cells were transformed with the expression plasmid containing the triple FLAG BAP construct made according to the methods of Example 1. Cells were grown in Terrific Broth containing 100 μg/ml ampicillin at 37° C. with agitation. The culture was grown to an OD₆₀₀=4.0 and then induced with IPTG at a final concentration of 1 mM. The cell culture was grown for an additional 3 hours at 37° C. and then harvested by centrifugation. The cell pellet was resuspended in 50 mM Tris-HCl pH 8.0, the cells were disrupted by sonication, and cellular debris was removed by centrifugation. The supernatant was applied to M2 affinity gel that had been equilibrated with 50 mM Tris-HCl pH 8.0, 150 mM NaCl (TBS). The resin was washed with 20 bed volumes of TBS and then the triple FLAG BAP was eluted with five column volumes of 0.1 M Glycine pH 3.5. The eluted protein was pooled and adjusted to pH 7.5 with 1.0 M Tris-HCl pH 8.0. Protein content was determined both by the Bradford assay and by absorbance using E₂₈₀=0.7 ml/mg.
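The absorbance-based estimate mentioned above is a direct application of the Beer-Lambert relation, c = A / (E × l). The sketch below assumes that the stated coefficient of 0.7 ml/mg applies per centimeter of path length, which the text does not state explicitly. The function name and the example reading of 0.35 are hypothetical.

```python
# Illustrative sketch: protein concentration from absorbance at 280 nm.
E280_ML_PER_MG = 0.7  # coefficient from the text; "per cm" is an assumption


def protein_mg_per_ml(a280, path_cm=1.0, dilution_factor=1.0):
    """Return concentration in mg/ml via c = A / (E * l), corrected for dilution."""
    if a280 < 0 or path_cm <= 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0; path and dilution must be > 0")
    return a280 * dilution_factor / (E280_ML_PER_MG * path_cm)


# Hypothetical reading for an undiluted sample in a 1 cm cuvette.
example = protein_mg_per_ml(0.35)
print(f"{example:.2f} mg/ml")
```

Comparing this value against the Bradford result, as the text describes, guards against errors from buffer components that absorb at 280 nm.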

Western Blot

Purified 3XFLAG-BAP and N-BAP were diluted with 2× Laemmli buffer, boiled for five minutes and then placed on ice. Samples were resolved on a 15% SDS-PAGE gel using the method of Laemmli (Laemmli, U. K., Nature, 227:680-685 (1970)) and then transferred to nitrocellulose membranes. The membrane was blocked with phosphate buffered saline containing 3% non-fat dry milk for 1 hour and then rinsed three times in TBS, 0.05% Tween 20 (TBS-T). The membrane was incubated with M2 antibody at a final concentration of 10 μg/ml for 30 minutes in TBS-T and then rinsed three times in TBS-T. The membrane was then incubated for 30 minutes with a goat anti-rabbit IgG (whole molecule) horseradish peroxidase (HRP) conjugate diluted 1:10,000 in TBS-T, then rinsed three times in TBS-T. The FLAG® tagged proteins were detected with the HRP conjugates and visualized by chemiluminescent detection using ECL (Amersham) and Kodak X-Omat MR film according to the manufacturer's directions with exposures from 1 to 30 minutes.

Results

To address whether a triple FLAG fusion protein produces a more sensitive response than the traditional FLAG® epitope, a triple FLAG version of bacterial alkaline phosphatase was constructed for expression in E. coli. The vector p3XFLAG-ATS-BAP was transformed into E. coli and the 3XFLAG-BAP expressed and purified as described in Materials and Methods. In addition, an N-FLAG-BAP containing the traditional FLAG® epitope (DYKDDDDK) (SEQ ID NO: 1) was also expressed and purified. The sensitivity of the single versus the triple FLAG-BAP was compared by western blot analysis as described above. FIG. 4 shows the western blot of purified single and triple FLAG-BAP probed with M2 anti-FLAG antibody and detected by chemiluminescence. The results clearly indicate a 10-fold improvement in the detection limit of the triple FLAG-BAP compared to the single FLAG-BAP fusion protein. Applicants were able to detect 500 picograms of purified 3XFLAG bacterial alkaline phosphatase with exposures as short as 1 minute. With increased exposure time, detection as low as 100 picograms has been achieved but with increased background. Applicants have also demonstrated at least a 10-fold increase in detection in both dot blot and ELISA assays.

Example 3 Expression of 3× Bacterial Alkaline Phosphatase in COS-7 Cells

Materials and Methods

Transfection of COS-7 Cells with p3XFLAG-CMV-7-BAP

COS-7 cells were cultured on 35 mm² plates in Dulbecco's Modified Eagle's Medium (DME), containing 10% fetal bovine serum, 4 mM L-glutamine, 5 μg/ml gentamycin. Cells were grown at 37° C. in a humidified CO₂ incubator with 5% CO₂. Transfection of the p3XFLAG-CMV-7-BAP plasmid was accomplished using Lipofectamine (Life Technologies Inc., Gaithersburg, Md.) according to the manufacturer's directions. Two micrograms of vector DNA were used for the transfection. Immunostaining was done 72 hours post transfection.

Immunostaining

At 72 hours post transfection, the cells were washed with 50 mM Tris-HCl pH 7.4, 150 mM NaCl (TBS). The cells were fixed with a 1:1 (v/v) methanol-acetone mixture for 1 minute. The fixed cells were washed four times with TBS and then incubated with 10 μg/ml M2 antibody-HRP conjugate in TBS for 1 hour. Cells were washed with TBS five times and the M2 antibody-HRP conjugate visualized with freshly prepared 0.01 mg/ml o-dianisidine, 0.015% hydrogen peroxide in TBS. Cells were stained for approximately 15 minutes.

Results

p3XFLAG-CMV-7-BAP (FIG. 2) was transfected into COS-7 cells as described in the materials and methods. At 72 hours post transfection, the cells were analyzed by immunostaining using an anti-FLAG M2 HRP conjugate. Light microscopy of cells detected with M2 antibody, and visualized with o-dianisidine, is shown in FIG. 3.

Discussion

Applicants have created a mammalian expression plasmid containing multiple FLAG® epitopes in tandem, p3XFLAG-CMV-7, designed for intracellular expression with increased sensitivity of detection. This vector contains the cytomegalovirus (CMV) promoter and SV40 origin of replication for efficient expression in COS-7 cells. Moreover, detection of triple FLAG tagged BAP expressed and purified from E. coli was compared to that of single FLAG® tagged BAP.

The FLAG® epitope tag has been effectively used to detect and purify protein in mammalian and bacterial systems. Applicants have demonstrated that the presence of three FLAG epitopes significantly improves the detection limit of purified bacterial alkaline phosphatase. Moreover, the 3XFLAG-BAP cannot be eluted from anti-FLAG M2 affinity gel by competition with the original FLAG® peptide. However, 3XFLAG-BAP and the 1× FLAG-BAP can be competitively eluted from the anti-FLAG M2 affinity gel using 3XFLAG peptide. The p3XFLAG-CMV-7 vector was designed for expression and detection of heterologous proteins in mammalian cells and is compatible with existing pFLAG-CMV vectors, thus allowing for easy subcloning between vectors containing the single FLAG® and the triple FLAG. The immunostaining results show that expression of the phoA gene in COS-7 cells is not significantly perturbed by addition of the 3XFLAG sequence.

The M2 antibody reacts with the alternate FLAG® in the 3XFLAG sequence. In contrast, M5 antibody fails to show the increased sensitivity that the M2 antibody demonstrates. Recent results using phage display have demonstrated that the critical residues for M2 binding and M5 binding are slightly different. M2 antibody prefers the sequence Asp-Tyr-Lys-XXX-XXX-Asp-XXX-XXX (SEQ ID NO: 13) while M5 prefers Asp-Tyr-XXX-XXX-Asp-Asp-XXX-XXX (SEQ ID NO: 14). The triple FLAG sequence Asp-Tyr-Lys-Asp-His-Asp (SEQ ID NO: 12) clearly favors the binding of M2 over that of M5 or even M1 antibody.
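The difference between the two preferred motifs can be made concrete by scanning a triple FLAG sequence for each pattern. In the sketch below, the 22-residue sequence is an assumption inferred from the description of two Asp-Tyr-Lys-Asp-His-Asp units with Gly and Ile spacers followed by the original epitope. The regular expressions are a simplified reading of SEQ ID NOS: 13 and 14 in which each XXX is treated as any residue, and the helper name is hypothetical.

```python
import re

# Assumed triple FLAG sequence (one-letter code), inferred from the text:
# DYKDHD - G - DYKDHD - I - DYKDDDDK
TRIPLE_FLAG = "DYKDHDGDYKDHDIDYKDDDDK"

# Simplified motif readings; "." stands for any residue (XXX in the text).
M2_MOTIF = "DYK..D.."  # Asp-Tyr-Lys-X-X-Asp-X-X (SEQ ID NO: 13)
M5_MOTIF = "DY..DD.."  # Asp-Tyr-X-X-Asp-Asp-X-X (SEQ ID NO: 14)


def motif_starts(sequence, motif):
    """Return start positions of all matches, including overlapping ones."""
    # A lookahead is used because adjacent 8-residue windows share a residue.
    return [m.start() for m in re.finditer(f"(?={motif})", sequence)]


m2_sites = motif_starts(TRIPLE_FLAG, M2_MOTIF)
m5_sites = motif_starts(TRIPLE_FLAG, M5_MOTIF)

print("M2 motif starts:", m2_sites)
print("M5 motif starts:", m5_sites)
```

Under these assumptions the M2 motif occurs in all three units while the M5 motif occurs only in the final, unmodified epitope, which is consistent with the observation that M5 does not show the increased sensitivity seen with M2.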

Example 4 Analysis of the FLAG M2 Antibody Binding to Multiple FLAG Epitopes

Materials and Methods

Thermodynamic analysis of the M2 antibody binding to the FLAG epitopes was performed by isothermal titration calorimetry using an OMEGA calorimeter (MicroCal). All samples were dialyzed against PBS containing 0.05% sodium azide and degassed prior to the measurements. All measurements were made at 25° C. The concentration of M2 antibody was between 15 and 50 μM depending upon which samples were used. The concentrations of the titrants were 605 μM for the 1× BAP, 1110 μM for the 1× FLAG peptide, 400 μM for the 3× BAP, and 580 μM for the 3× FLAG peptide. Injections were carried out every 2.5 to 3.0 minutes, which was sufficient for baseline to be achieved, with injection volumes ranging from 4 to 11 μL. Injections were carried out over a 4 to 10 second period while stirring at 400 rpm.

Data analysis and fitting were performed using the Origin software supplied by MicroCal. The enthalpies were obtained by numerical integration of the data and subtraction of the heats of dilution. Values of Ka and n, the number of binding sites, were determined by fitting the data to a theoretical curve; only the enthalpy was held constant during the fitting process.

Results and Discussion

In the case of the 1× FLAG system, Ka was small enough that the value r<1000, where r=Ka·Mt(0) and Mt(0) is the initial concentration of M2 antibody in the cell. For the 3× FLAG system, the Ka was large enough that r>1000, indicating a tight binding system, and thus the Ka cannot be determined accurately.

Applicants have demonstrated that placing three epitopes in tandem produces an increase in the association constant that is well over an order of magnitude larger than that of one epitope, as shown in Table 1.

TABLE 1

      1× BAP     1× Peptide   3× BAP     3× Peptide
Ka    1.69E+07   1.17E+07     2.09E+08   3.66E+08
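The r criterion and the fold increase can be reproduced from the Table 1 values with a few lines of arithmetic. The sketch below assumes that Ka is expressed in M⁻¹ (the units are not stated) and evaluates r = Ka·Mt(0) at both ends of the 15 to 50 μM antibody range given in the methods. The variable names are hypothetical.

```python
# Illustrative arithmetic on the Table 1 association constants.
KA_PER_M = {  # units of 1/M are an assumption; Table 1 gives no units
    "1x BAP": 1.69e7,
    "1x peptide": 1.17e7,
    "3x BAP": 2.09e8,
    "3x peptide": 3.66e8,
}
M2_RANGE_M = (15e-6, 50e-6)  # initial M2 antibody concentration range, in M

# r = Ka * Mt(0), evaluated at the low and high ends of the antibody range.
r_values = {
    name: tuple(ka * mt0 for mt0 in M2_RANGE_M) for name, ka in KA_PER_M.items()
}
for name, (r_low, r_high) in r_values.items():
    print(f"{name:10s} r = {r_low:7.0f} to {r_high:7.0f}")

# Fold increase in Ka on going from one epitope to three.
fold_bap = KA_PER_M["3x BAP"] / KA_PER_M["1x BAP"]
fold_peptide = KA_PER_M["3x peptide"] / KA_PER_M["1x peptide"]
print(f"BAP: {fold_bap:.1f}-fold, peptide: {fold_peptide:.1f}-fold")
```

With these inputs both single-epitope systems stay below r = 1000 across the whole concentration range and both triple-epitope systems exceed it, which matches the tight-binding caveat given for the 3× measurements.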

The values of Ka for the single epitope peptide and the single epitope BAP are similar, which suggests comparable binding mechanisms. For the three-epitope systems, the values of Ka for the peptide and for the epitopes on BAP are likewise similar, also suggesting comparable mechanisms. The increased level of detection observed in the triple FLAG system is due primarily to an increase in the association constant.

Other features, objects and advantages of the present invention will be apparent to those skilled in the art. The explanations and illustrations presented herein are intended to acquaint others skilled in the art with the invention, its principles, and its practical application. Those skilled in the art may adapt and apply the invention in its numerous forms, as may be best suited to the requirements of a particular use. Accordingly, the specific embodiments of the present invention as set forth are not intended as being exhaustive or limiting of the present invention.

1. An identification polypeptide for use in purifying a target peptide wherein said identification polypeptide comprises: a. multiple copies of an antigenic domain joined together in tandem, each of the antigenic domains comprising no more than twenty amino acid residues with at least two different amino acid residues; and b. a linking sequence between the multiple copies of the antigenic domain and the target peptide molecule, the linking sequence comprising a cleavable site wherein the cleavable site is not duplicated within or interposed between the multiple copies of the antigenic domain.
2. The identification polypeptide of claim 1 wherein the amino acid sequence of each such antigenic domain comprises at least one-half hydrophilic amino acid residues.
3. The identification polypeptide of claim 2 wherein the amino acid sequence of each such antigenic domain further comprises at least three-fourths hydrophilic amino acid residues.
4. The identification polypeptide of claim 1 wherein the amino acid sequence of each such antigenic domain comprises at least one amino acid selected from the group of hydrophilic amino acid residues and at least one amino acid selected from the group of aromatic amino acid residues.
5. The identification polypeptide of claim 4 wherein the amino acid sequence of each such antigenic domain comprises no more than ten amino acid residues with at least two different amino acid residues.
6. The identification polypeptide of claim 4 wherein the amino acid sequence of each such antigenic domain comprises no more than six amino acid residues with at least two different amino acid residues.
7. The identification polypeptide of claim 1 wherein the amino acid sequence of each such antigenic domain comprises a plurality of amino acids of the group consisting of arginine, proline, glutamic acid, aspartic acid, phenylalanine and isoleucine.
8. The identification polypeptide of claim 1 wherein said cleavable site comprises an amino acid sequence being cleavable by a sequence-specific proteolytic agent at a specific amino acid residue adjacent to the target peptide molecule, wherein the sequence-specific proteolytic agent is selected from the group consisting of enterokinase, Factor Xa and thrombin.
9. The identification polypeptide of claim 8 wherein said cleavable site is an enterokinase recognition site.
10. The identification polypeptide of claim 1 further comprising a spacer domain comprising at least one amino acid residue interposed between any two or more antigenic domains of said multiple copies of the antigenic domain or between the multiple copies of the antigenic domain and the linking sequence.
11. The identification polypeptide of claim 10 wherein the amino acid sequence of said spacer domain is selected from the group consisting of hydrophilic amino acid residues.
12. The identification polypeptide of claim 10 wherein said spacer domain further comprises at least one histidine residue, at least one glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His or -(His-X)_m-, wherein m is 1 to 6 and X is selected from the group consisting of Gly, His, Tyr, Lys, Phe, Met, Ala, Glu, Ile, Thr, Asp, Asn, Gln, Arg, Cys, and Pro.
13. The identification polypeptide of claim 10 wherein the amino acid sequence of said spacer domain comprises isoleucine.
14. The identification polypeptide of claim 1 further comprising a metal chelating sequence joined to said multiple copies of the antigenic domain, wherein the metal chelating sequence comprises at least one histidine residue, at least one glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His or -(His-X)_m-, wherein m is 1 to 6 and X is selected from the group consisting of Gly, His, Tyr, Trp, Met, Asp, Asn, Gln, Arg, Cys, and Pro.
15. The identification polypeptide of claim 1 further comprising a multiple cloning site comprising multiple restriction enzyme recognition sites.